The Brier score does not evaluate the clinical utility of diagnostic tests or prediction models

Authors

  • Melissa Assel
  • Daniel D. Sjoberg
  • Andrew J. Vickers
Abstract

Background: A variety of statistics have been proposed as tools to help investigators assess the value of diagnostic tests or prediction models. The Brier score has been recommended on the grounds that it is a proper scoring rule that is affected by both discrimination and calibration. However, the Brier score is prevalence dependent in such a way that the rank ordering of tests or models may inappropriately vary by prevalence.

Methods: We explored four common clinical scenarios: comparison of a highly accurate binary test with a continuous prediction model of moderate predictiveness; comparison of two binary tests where the importance of sensitivity versus specificity is inversely associated with prevalence; comparison of models and tests to default strategies of assuming that all or no patients are positive; and comparison of two models with miscalibration in opposite directions.

Results: In each case, we found that the Brier score gave an inappropriate rank ordering of the tests and models. Conversely, net benefit, a decision-analytic measure, gave results that always favored the preferable test or model.

Conclusions: The Brier score does not evaluate the clinical value of diagnostic tests or prediction models. We advocate, as an alternative, the use of decision-analytic measures such as net benefit.

Trial registration: Not applicable.
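
The contrast the abstract draws between the Brier score and net benefit can be illustrated with a minimal sketch. It is not the authors' code: it uses the standard textbook definitions (Brier score as the mean squared difference between predicted probability and outcome; net benefit as true positives per patient minus false positives per patient weighted by the odds of the threshold probability), and the cohort below is hypothetical. It assumes 100 patients at 10% prevalence, a binary test with 90% sensitivity and 80% specificity, and an assumed threshold probability of 5%, and compares the test with the default strategy of assuming that no patient is positive.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def net_benefit(probs, outcomes, threshold):
    """Net benefit at a threshold probability:
    TP/n - FP/n * threshold / (1 - threshold).
    A patient is classified positive when the predicted probability >= threshold.
    """
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical cohort (illustrative numbers, not taken from the paper):
# 10 diseased patients followed by 90 non-diseased patients.
outcomes = [1] * 10 + [0] * 90

# Binary test reported as a probability of 1.0 (positive) or 0.0 (negative):
# 9 true positives, 1 false negative, 18 false positives, 72 true negatives.
test_probs = [1.0] * 9 + [0.0] * 1 + [1.0] * 18 + [0.0] * 72

# Default strategy: assume that no patient is positive.
none_probs = [0.0] * 100

threshold = 0.05  # assumed threshold probability, chosen for illustration

print(f"Brier, binary test:      {brier_score(test_probs, outcomes):.3f}")  # 0.190
print(f"Brier, assume none:      {brier_score(none_probs, outcomes):.3f}")  # 0.100
print(f"Net benefit, test:       {net_benefit(test_probs, outcomes, threshold):.3f}")  # 0.081
print(f"Net benefit, assume none: {net_benefit(none_probs, outcomes, threshold):.3f}")  # 0.000
```

In this toy cohort the Brier score is lower (apparently better) for assuming that no patient is positive, while net benefit at the assumed 5% threshold favors the test. A different threshold or prevalence would change the numbers, so the sketch illustrates the kind of disagreement the abstract describes and does not reproduce its results.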

Similar resources

Utility of the Atopy Patch Test in the Diagnosis of Allergic Rhinitis

Introduction: The diagnostic work-up of allergic rhinitis (AR) is first and foremost based on the combination of clinical history data and results of skin prick tests (SPT). Other tests, including specific IgE measurement, nasal challenge, and, as a third option, component resolved diagnosis or basophil activation test, may be useful when the diagnosis is difficult because of polysensitization ...

Contrastive analysis of diagnostic tests evaluation without gold standard: review article

With the advancement of medical sciences, diagnostic tests have been developed to distinguish patients from the healthy population. Therefore, determining and evaluating the accuracy of diagnostic tests is of great importance. The accuracy of a test under evaluation is determined by the degree of agreement between its results and those of the gold standard, and this test accuracy...

Comparison of the performance of the Cox model and the K-nearest neighbor method in estimating the survival of kidney transplant patients

Introduction & Objective: The Cox model is a common method for estimating survival, and the validity of its results depends on the proportional hazards assumption. K-nearest neighbor is a nonparametric method for estimating survival probability in heterogeneous communities. The purpose of this study was to compare the performance of the k-nearest neighbor method (K-NN) with the Cox model. Materials & Methods: This ...

Calibration of Bar-Concrete Bond Stress Relationships for Bond Stress Prediction of GFRP Soil Nails Using Experimental Pullout Tests

Even though steel bars are a conventional reinforcement in soil stabilization systems, corrosion of the steel may lead to extensive damage, especially in aggressive environments. In the past decades, Fiber Reinforced Polymer (FRP) materials have offered an effective solution to overcome the corrosion problem. Despite numerous bond stress-displacement models for reinforcements in concrete, t...

Adding propensity scores to pure prediction models fails to improve predictive performance

Background. Propensity score usage seems to be growing in popularity leading researchers to question the possible role of propensity scores in prediction modeling, despite the lack of a theoretical rationale. It is suspected that such requests are due to the lack of differentiation regarding the goals of predictive modeling versus causal inference modeling. Therefore, the purpose of this study ...

Publication date: 2017